A Framework for Reproducible, Interactive Research: 
Application to health and social sciences 
Abstract 

The aim of this article is to introduce a reporting framework for reproducible, interactive research 
applied to Big Clinical Data, based on open source technologies. The framework is constituted by the 
following three axes: (i) data, (ii) analytical codes and (iii) dissemination. In this paper, different 
documentation formats and online repositories are introduced. To integrate and manage the reproducible 
contents, we propose the R Language as the tool of choice. All the information is then published and 
gathered in a website for different projects. This framework is free and user friendly and is proposed to 
enhance reproducibility of health-science reports. 



1. Introduction 



With the growing amount of data in healthcare, the ability to analyze large datasets and report
results adequately has become a key factor of research and innovation [1], which supports the
creation of new technologies and improved clinical decision making. The increased complexity of
these datasets brings with it difficulties and new challenges in terms of data management,
modeling and communication. Therefore, investigators are now focusing on developing reproducible
research protocols, including entirely reproducible data analysis. This implies that the results
reported in a publication can be reproduced immediately by granting access to both the datasets
and the statistical and data mining scripts of the study [2].

In order to make the information widely usable, the value of data collection, analysis and
communication as well as the use of common standards for sharing information have been
recognized. In addition to increasing dissemination and better understanding of research
findings, data sharing can also support confirmation or refutation of research by allowing
replication and increased transparency of results [2,3].

However, data sharing does bring some implementation challenges and possible risks. Potential
invasion of participants' privacy and breaches of patients' confidentiality are primary concerns
when making datasets public. Secondly, adequate data management, academic and commercial primacy,
and intellectual property rights as well as journal copyrights are factors to be careful with
while publishing data [3]. In this context, the use of an adequate framework becomes essential to
allow reproducible research without compromising such aspects, especially when analyzing and
reporting results from large datasets.

Thus, the aim of this article is to introduce a simple reporting framework for reproducible,
interactive research applied to health and social sciences. The framework is constituted by the
following three axes: (i) data (Section 2.1), (ii) analytical codes (Section 2.3) and (iii)
dissemination (Section 2.6). In this paper, different documentation formats and online
repositories are introduced. To integrate and manage the reproducible contents, we propose the R
Language as the tool of choice. All the information is then published and gathered in a website
for different projects. This framework is free and user friendly and is proposed to enhance
reproducibility of health-science reports.

2. Reproducible Research Framework

The framework proposed in this paper is based on the concept that an appropriate reproducible
research report should allow one to fully reproduce the methods applied. Thus, we understand that
besides making the analytical data, code and figures available, an adequate reproducible research
framework should integrate tools and features in such a way that others can reach the same
results and understand the process behind them.

Therefore, in order to achieve an adequate integration between data, codes and outcomes (figures,
tables, numerical results and others) in our framework, we utilize the R Language [4] as the
central tool. R is able to integrate and manage different data formats, codes and output formats.
In addition, it allows communication with several other analytical software packages such as SAS,
Stata and SPSS [5, 6, 7].

2.1 Data formats 

The first issue in making a research protocol reproducible is the data management process. There
are several ways of storing data and many different data formats. In our perspective, some of
them are preferable because they integrate well with data analysis software and online
repositories and are easy to use. In the following sections, we present some of the formats we
have been using and their integration with our reproducible framework.

2.1.1 Reproducible Data 

When making datasets publicly available, one must be concerned with the information that is going
to be made public. In this context, the Health Insurance Portability and Accountability Act
(HIPAA) defines Protected Health Information (PHI), which means that individually identifiable
health information must be kept confidential when sharing data in healthcare. The complete list
of PHI can be found at the US Department of Health and Human Services [8].

Secondly, it is important to make sure that the data is coded with appropriate names that allow
other people to read and understand the content easily. To make this easier, we strongly
encourage the publication of a complete and organized data dictionary together with the dataset,
containing variable labels, respective codes, data characteristics (continuous, discrete,
ordinal, dichotomous, etc.) and any other source of relevant information (e.g. length of Likert
scale, categorization factors).
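
The two checks described above can be sketched in a few lines. The example below is only an illustration, written in Python with the standard library rather than in R, and it makes several assumptions: the column names, the `PHI_COLUMNS` set and the layout of `DATA_DICTIONARY` are hypothetical, and the real list of identifiers has to be taken from the HIPAA guidance [8]. It removes identifying columns before publication and reports any remaining variable that the data dictionary does not describe.

```python
import csv
import io

# Hypothetical identifying columns; the real list must follow the HIPAA guidance.
PHI_COLUMNS = {"name", "birth_date", "phone"}

# Hypothetical data dictionary: variable name -> (label, data characteristic).
DATA_DICTIONARY = {
    "age_group": ("Age group of the participant", "ordinal"),
    "pain_score": ("Pain on a 5-point Likert scale", "ordinal"),
    "complication": ("Any complication (0 = no, 1 = yes)", "dichotomous"),
}


def deidentify(rows, phi_columns):
    """Return copies of the rows without the identifying columns."""
    return [{k: v for k, v in row.items() if k not in phi_columns} for row in rows]


def undocumented_variables(rows, dictionary):
    """Return the variables present in the data but missing from the dictionary."""
    variables = set()
    for row in rows:
        variables.update(row)
    return sorted(variables - set(dictionary))


# Made-up records, used only to exercise the two functions.
raw_csv = (
    "name,birth_date,phone,age_group,pain_score,complication\n"
    "Jane Doe,1970-01-01,555-0100,40-49,3,0\n"
    "John Roe,1985-06-15,555-0101,20-29,5,1\n"
)
rows = list(csv.DictReader(io.StringIO(raw_csv)))
public_rows = deidentify(rows, PHI_COLUMNS)
missing = undocumented_variables(public_rows, DATA_DICTIONARY)
```

An empty `missing` list means every published variable has a dictionary entry. Removing columns is only one part of de-identification and does not by itself guarantee compliance.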
2.1.2 CSV 

Comma separated values (CSV) is a format 
readily available for consumption by any data 
analysis language or software. However, it 
does not provide a way to update the data 
once it is downloaded other than downloading 
the dataset again. In addition, the CSV format 
does not offer any security features. 

On the other hand, CSV files offer one of the best usability experiences among all the formats,
and they can be easily integrated with R using online repositories (e.g. Google Drive or Dryad)
through different R packages. One such package is the RCurl package [9], which can connect R to
different web resources, among them a .csv spreadsheet from Google Docs.
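
To make the CSV workflow concrete, the following minimal sketch shows the parse step and the download step separately. It is written in Python with the standard library as an illustration, not as the RCurl-based approach described above; the function names `read_csv_text` and `read_csv_url` and the sample data are assumptions introduced here. `read_csv_url` is defined but not called, since it needs the address of a real published file.

```python
import csv
import io
from urllib.request import urlopen


def read_csv_text(text):
    """Parse CSV text into a list of dictionaries keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def read_csv_url(url):
    """Download a CSV file and parse it; picking up later updates needs a new download."""
    with urlopen(url) as response:
        return read_csv_text(response.read().decode("utf-8"))


# Made-up sample standing in for a file stored in an online repository.
sample = "id,group,score\n1,control,12.5\n2,treated,15.0\n"
records = read_csv_text(sample)

# CSV carries no type information, so numeric columns are converted explicitly.
mean_score = sum(float(r["score"]) for r in records) / len(records)
```

The explicit `float` conversion illustrates why a data dictionary matters for CSV files: the format itself stores every value as text.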

2.1.3 RDF, LOD and SPARQL endpoints 

Semantic Web technologies have recently become popular given the success of Linked Open Data
(LOD) [10]. The data is represented with the help of the Resource Description Framework (RDF)
format, while the data sets themselves are queried through SPARQL (a recursive acronym for SPARQL
Protocol and RDF Query Language). The main advantages include data availability 24/7 with
automated updates and the ability to dynamically merge data sets that share identical elements
(classes or instances).

RDF data can be easily integrated with R analytical codes through the rrdf package [11]. This
package allows users to perform SPARQL queries inside R's workspace. In addition to this package,
there is a whole set of tutorials and packages that can be used within R.

2.1.4 JSON 

JavaScript Object Notation (JSON) is considered one of the best data-interchange formats. It is a
text format with conventions familiar to several programming languages such as C++, Java,
JavaScript and Python.

More information and specifications about how to integrate JSON data with specific applications
can be found at [12]. Its connection with R analytical code is made through the rjson package
[13], which converts JSON objects into R objects.
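
The conversion from JSON text to native objects can be illustrated as follows. This is a sketch in Python using its standard `json` module, shown only as an analogue of what rjson does for R; the study record and its field names are made up for the example.

```python
import json

# Made-up JSON document; the field names are hypothetical.
json_text = """
{
  "study": "observer agreement",
  "raters": 3,
  "scores": [
    {"patient": 1, "ratings": [2, 2, 3]},
    {"patient": 2, "ratings": [4, 4, 4]}
  ]
}
"""

# JSON objects become dictionaries and JSON arrays become lists.
study = json.loads(json_text)

# Proportion of patients for whom all raters gave the same rating.
unanimous = [len(set(s["ratings"])) == 1 for s in study["scores"]]
agreement = sum(unanimous) / len(unanimous)

# Writing the object back to JSON and reading it again preserves its content.
round_trip = json.loads(json.dumps(study))
```

Unlike CSV, the nested structure and the distinction between numbers and text survive the transfer, which is what makes the format convenient for interchange.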

2.2 Data repositories 

After deciding on a data format, it is also necessary to use an online repository to store the
data and integrate it with the analytical codes (discussed later in Section 2.3). In the
following sections, we present some options of free repositories that are used by our group.

2.2.1 Dryad 

Dryad [14] is an international repository specific for data related to scientific publications.
It allows data to be deposited easily and readily provides the citation related to the respective
publication. Dryad can be integrated with R, thus improving interoperability [15].

2.2.2 Figshare 

Figshare [16] is an online repository, similar to Dryad, that allows researchers to publish
research outputs in a form that can be cited within a paper.

Additionally, Figshare supports not only data but also other types of research outputs such as
figures, media files, papers, posters or even file sets with different types of documents. A
major advantage of Figshare is the ease with which information about different research projects
can be shared and discovered. We have used Figshare to publish datasets (in .csv format) as well
as figures. Examples can be found in [17][18]. In addition, Figshare can also be integrated with
R through some packages [19].

2.2.3 Google Drive

Google Drive is another online repository which facilitates collaboration and sharing of files
[20]. This application from Google integrates texts, spreadsheets, presentations and other
editors from Google (i.e. Google Docs, Google Sheets, Google Forms and others) and also allows
the user to store forms, drawings and different types of files in the cloud.

Google Drive is extensively used to share data, codes and other outputs among researchers in our
group [21]. One way to connect data stored in Google Drive with R is the RCurl package [9]. This
package allows users to compose general HTTP requests and to retrieve URLs and other web
resources, such as datasets in .csv format. Another way is to simply open the files stored in
Google Drive (spreadsheets or R scripts, for example) inside R, through RStudio [22].

In addition, we also use Google Drive as a way to integrate and facilitate collaborative writing
and coding in R, since this approach has been found more user friendly to content researchers
than other more sophisticated repositories.



2.3 Analytical scripts

Publishing analytical codes is an important step in a reproducible framework, as is the
connection between the codes and the data. Therefore, we present here the different software that
can be used to generate, publish and manage the analytical codes.

2.3.1 R Language Statistical Software

As mentioned before, R [4] is the central tool of our reproducible research framework. R is open
source software for statistical analysis and graphics creation. It has been developed by a vast
community of collaborators from several countries and institutions.

Although R is not superior to other statistical software in every aspect (such as an intuitive
graphical user interface or pre-defined operations), it gathers qualities which make it a better
option for our framework than other statistical environments. One major advantage of R is its
collaborative function in the development of packages. R has a huge library [4] (the
Comprehensive R Archive Network, CRAN) of packages for statistical analysis, graphics creation,
data mining and management, and integration with other software and programming languages.

This collaborative ability, besides making R a powerful analytical environment, makes it assume a
position in our framework as a glue for other languages and technologies such as Python, Java,
relational databases, RDF, C, C++ and Weka, among many others. This way we can gather data and
data storage tools, analytical coding and repositories for outputs, making a research project
fully reproducible. In addition, R is used by a large community and has a lot of references to
lean on.

In our group we opt to run R through RStudio [22]. This platform is also open source and is an
integrated environment that helps to visualize the different R interfaces (workspace, graphs,
scripts and log). In addition, it facilitates the management of multiple working directories
through the definition of projects.



2.3.2 Reproducible Scripts 

As suggested by Hadley Wickham in his Github repository [23], the idea is to create code whose
results can be recreated just by copying the codes we publish online. Therefore, each script must
be connected to the dataset and contain all the information needed to run.

The elements of a reproducible script in R include the required packages, the connection to the
data, the codes and the descriptions of the codes. Each function in R belongs to a package, so
anyone who wants to reproduce our codes must have all the required packages installed.

Regarding the data, we have already discussed the possible formats and ways to publish it. It is
noteworthy that the data must be aligned with the codes. This means that all the variables must
be named exactly as they are named in the codes. Also, every data management step must be
included in the codes so that whoever is trying to reproduce the analysis reaches the same
results. Finally, each line must have a description of its purpose and use.
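
The elements listed above can be laid out as a short script skeleton. The sketch below is written in Python with the standard library purely as an illustration of the structure (an R script would follow the same order); the dataset, the variable names and the `REQUIRED` and `USED_VARIABLES` lists are hypothetical.

```python
import csv
import importlib.util
import io
import statistics

# 1. Required packages: stop early if one of them is not installed.
REQUIRED = ["csv", "statistics"]
missing_packages = [p for p in REQUIRED if importlib.util.find_spec(p) is None]
if missing_packages:
    raise ImportError(f"Install before running: {missing_packages}")

# 2. Connection to the data (inline here; a published script would read the
#    dataset from the online repository where it is stored).
DATA = "subject,age,score\n1,34,10\n2,51,14\n3,47,\n4,29,12\n"
rows = list(csv.DictReader(io.StringIO(DATA)))

# 3. Alignment check: the variables used below must exist under these exact names.
USED_VARIABLES = ["age", "score"]
absent = [v for v in USED_VARIABLES if v not in rows[0]]
if absent:
    raise KeyError(f"Variables missing from the dataset: {absent}")

# 4. Data management: exclude subjects with a missing score.
complete = [r for r in rows if r["score"] != ""]

# 5. Analysis: mean score among the complete cases.
mean_score = statistics.mean(float(r["score"]) for r in complete)
```

Because the exclusion in step 4 is part of the script, a reader who runs it starts from the published data and arrives at the same analytical sample.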



Framework for Reproducible Research • April 2013 



2.3.3 Github 

Github [24] is an online repository built to facilitate the collaborative writing of computer
code. It not only allows the sharing of codes but also facilitates collaboration through the
copying (here called a "fork") of project pages, which keeps the original code safe. Among all
the qualities of using Github as a reproducibility strategy in the analytical coding process, we
highlight its strong connectivity with R. It allows not only the sharing and management of codes
in websites, but also the presentation of R outputs generated with knitr [25].

There are several possibilities of using R 
integrated with Github. We have been using 
Github mainly to: 

• publish analytical R-Scripts 

• promote collaboration among our data 
analysts when creating or debugging 
data analysis 

• generate automatic data reports for open 
design projects 

• create templates for data analysis (here 
called a data analysis toolbox) with expla- 
nations of the methods (using wiki pages) 
and descriptions of codes and outcomes. 

2.4 Dynamic research 

In order to have a completely reproducible script and also to facilitate data dissemination and
visualization, it is important to obtain automated and dynamic representations of tables, figures
and reports. R allows the creation of analytical codes that generate automated reports, for
example with the knitr package [25], which translates the analysis into an HTML report (or other
formats such as PDF). In summary, this package weaves the code and its output into a report
written in LaTeX or Markdown. An example of its application can be found in our Github repository
[26] for the Glocal Open Design Collection project. In this specific project we used knitr
together with R code to generate an automated report about data quality and associations.
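
The idea behind an automated report can be shown with a small stand-in. The sketch below is not knitr and does not reproduce its interface; it is a Python illustration, under the assumption of a hypothetical `build_report` function and made-up data, of how a Markdown data quality table can be regenerated from the data every time the script runs.

```python
import statistics


def build_report(title, variables):
    """Render a Markdown report with one summary row per variable."""
    lines = [
        f"# {title}",
        "",
        "| Variable | N | Missing | Mean |",
        "|---|---|---|---|",
    ]
    for name, values in variables.items():
        observed = [v for v in values if v is not None]
        mean = f"{statistics.mean(observed):.2f}" if observed else "NA"
        n_missing = len(values) - len(observed)
        lines.append(f"| {name} | {len(observed)} | {n_missing} | {mean} |")
    return "\n".join(lines) + "\n"


# Made-up data; None marks a missing value.
data = {"age": [34, 51, None, 29], "score": [10.0, 14.0, 12.0, 12.0]}
report = build_report("Data quality report", data)
```

Since the table is computed rather than typed, updating the dataset and rerunning the script is enough to bring the report up to date.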



Another way of using R to generate dynamic research is by developing interactive graphs. These
are graphs that can be customized or modified by the user (research subject, patient or any other
stakeholder) to get different slices of the dataset. R has several ways of generating interactive
graphs. Here, we would like to introduce rggobi and Shiny [27, 28, 29]. However, there are other
options that can be found at the CRAN task view for dynamic graphs [30].

2.5 Licensing 

Since all the documentation we are using is going to be made public, we need to ensure that its
use is covered by a license. The license makes explicit which uses are permitted, so that users
do not go beyond what it allows. This is important given the relevance of the information being
made public. In our framework, we have used Creative Commons, which is a free copyright license
framework [31].

Inserting a line regarding the licensing characteristics in each of the documents in a project is
sufficient to specify the type of license. Uses outside the license terms require approval from
the copyright owner. Basically, we allow the user to share and adapt the specific parts of the
project. The only restrictions are that the user must attribute the documents to the original
authors and must use them only for noncommercial purposes.
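
Adding the license line to every document of a project is easy to automate. The following sketch, in Python and with hypothetical names (`LICENSE_LINE`, `add_license`), prepends the line as a comment and leaves documents that already carry it unchanged; the wording of the line is an assumption, not a text prescribed by Creative Commons.

```python
# Assumed wording of the one-line notice; adapt it to the license actually chosen.
LICENSE_LINE = (
    "This work is licensed under a Creative Commons "
    "Attribution-NonCommercial 3.0 Unported License: "
    "http://creativecommons.org/licenses/by-nc/3.0/"
)


def add_license(text, comment_prefix="# "):
    """Prepend the license line as a comment unless the text already carries it."""
    if LICENSE_LINE in text:
        return text
    return f"{comment_prefix}{LICENSE_LINE}\n{text}"


# A two-line R script used as the document to be licensed.
script = "x <- c(1, 2, 3)\nmean(x)\n"
licensed = add_license(script)
```

The `comment_prefix` argument lets the same function serve file types with a different comment syntax, and applying it twice does not duplicate the notice.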

An example of a license notice is:

This code is licensed under a Creative Commons Attribution - Noncommercial 3.0 Unported License.

You are free:

• to Share - to copy, distribute and transmit the work

• to Remix - to adapt the work

Under the following conditions:

• Attribution - You must attribute the work in the manner specified by the author or licensor
(but not in any way that suggests that they endorse you or your use of the work).

• Noncommercial - You may not use this work for commercial purposes.

With the understanding that:

• Waiver - Any of the above conditions can be waived if you get permission from the copyright
holder.

• Public Domain - Where the work or any of its elements is in the public domain under applicable
law, that status is in no way affected by the license.

• Other Rights - In no way are any of the following rights affected by the license: your fair
dealing or fair use rights, or other applicable copyright exceptions and limitations; the
author's moral rights; rights other persons may have either in the work itself or in how the work
is used, such as publicity or privacy rights.

• Notice - For any reuse or distribution, you must make clear to others the license terms of this
work. The best way to do this is with a link to this web page.

For more details see http://creativecommons.org/licenses/by-nc/3.0/ [31].



2.6 Data Dissemination and Communication

Besides discussing methods and tools to make research reproducible, we also believe that it is
important to facilitate data communication and dissemination. This will not only allow users to
access the research project but will also broaden the reach and dissemination of the respective
projects.

In order to disclose and gather all the material from our group's research projects that was made
public, we created websites (using Google Sites [32]) for each of the projects, where we included
links to data repositories and code repositories and inserted reports and graphs. Any web design
tool can be used, but our choice of Google Sites is based on its free access and user friendly
interface. An example is the Observer Agreement website, which integrates all the reproducible
documentation for our research projects on observer agreement in orthopedic scales [33].

2.7 Overall workflow 

Summarizing the information discussed, we created a simple graphical demonstration of the
framework's conception (Figure 1). As mentioned before, the R language has a central position in
the framework's model, so R is used to manage and coordinate the documentation. Data is stored in
open access online repositories, in R-supported formats that allow the connection between data
and analytical code. The analytical codes are developed within the R interface and stored in an
open access online repository. Outputs generated by the codes are also stored in open source
online repositories. All this information is licensed and integrated in a website for the
research project.

3. Discussion 

In this study we aimed to introduce a reporting framework for reproducible and interactive
research, based on technologies and methods applied to some of the recent projects in our
research group (RoR). Several tools were described to publish datasets and analytical codes, all
centered on and managed by the R language.

The concept of our framework was initially based on some guidelines already published [34, 35].
Not many reports can be found in the literature on the use of reproducible research frameworks in
healthcare. Some researchers do publish their datasets or codes, but generally they are published
separately. In our framework we tried to address aspects of reproducibility beyond the data
alone, namely connectivity [36], dissemination and licensing, using free and user friendly
technologies that facilitate replication and improve transparency of results.

Some of the tools we showcased here have been extensively discussed and used for the development
of projects by many investigators. Github, for instance, has been extensively used by data
analysts and programmers, as have Dryad and Figshare, given the increased amount of data being
stored in clouds. However, these advances have not been observed as often in healthcare research,
specifically when it comes to Big Clinical Data and replication of health research protocols [3].

Although our proposed framework is still in progress and needs to be improved, we emphasize its
ability not only to share data and codes in a safe way, but also to connect and disseminate
information through free and user friendly technologies. We believe that only by sharing and
comparing methods can a consensus framework be created. Therefore, the proposed model can help
towards the standardization of reproducible research protocols in healthcare, adding value not
only for research, but also for innovation and clinical practice.

Figure 1: Depiction of the reproducible research framework.



References 

[1] Manyika J, Chui M, Brown B, Bughin J, Dobbs R, Roxburgh C, Byers AH (2011). Big data: The
next frontier for innovation, competition, and productivity. McKinsey and Company. Available:
http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation
Accessed 18 April 2013.

[2] Peng RD, Dominici F, Zeger SL (2006). Reproducible epidemiologic research. Am J Epidemiol.
May 1;163(9):783-9.

[3] Groves T, Godlee F (2012). Open science and reproducible research. BMJ. Jun 26;344:e4383.
Available: doi:10.1136/bmj.e4383. Accessed 18 April 2013.

[4] R-project contributors (2013). The R Project for Statistical Computing. Available:
www.r-project.org Accessed 18 April 2013.

[5] IBM. SPSS Software. Available: http://www-01.ibm.com/software/analytics/spss Accessed 18
April 2013.

[6] SAS Institute Inc. Statistical Analysis System - SAS. Available: http://www.sas.com Accessed
18 April 2013.

[7] StataCorp LP. Stata: Data analysis and statistical software. Available: www.stata.com
Accessed 18 April 2013.

[8] Health Information Privacy and Security (HIPAA). Available:
http://www.hhs.gov/ocr/privacy/hipaa Accessed 18 April 2013.



[9] Lang DT (2013). Package 'RCurl'. Available:
http://cran.r-project.org/web/packages/RCurl/RCurl.pdf Accessed 18 April 2013.

[10] Linked Data. Available: http://linkeddata.org Accessed 18 April 2013.

[11] Willighagen E (2013). Package 'rrdf'. Available:
http://cran.r-project.org/web/packages/rrdf/rrdf.pdf Accessed 18 April 2013.

[12] JavaScript Object Notation (JSON). Available: www.json.org. Accessed 18 April 2013.

[13] Couture-Beil A (2013). Package 'rjson'. Available:
http://cran.r-project.org/web/packages/rjson/rjson.pdf Accessed 18 April 2013.

[14] Dryad Digital Repository. Available: datadryad.org Accessed 18 April 2013.



[15] Chamberlain S, Boettiger C, Ram K (2013). Package 'rdryad'. Available:
http://cran.r-project.org/web/packages/rdryad/rdryad.pdf Accessed 18 April 2013.

[16] Figshare. Available: http://figshare.com/ Accessed 18 April 2013.

[17] Moreira T, Yen T, Vissoci JRN, Barros T, Ejnisman L, Massa B, Pietrobon R, Vail TP (2013).
Total Hip Arthroplasty Complications Prevalence Meta-analysis at 5, 15 and 20 years followup.
Available: http://dx.doi.org/10.6084/m9 Accessed 18 April 2013.

[18] Dal Ponte T, Pessin DV, Gambeta CE, Ferreira APB, Braga L, Vissoci JRN, Braga-Baiak A,
Gandhi M, Pietrobon R (2013). The reliability of AO classification on femur fractures among
orthopedic residents. Available: http://dx.doi.org/10.6084/m9 Accessed 18 April 2013.

[19] Boettiger C, Chamberlain S, Ram K, Hart E (2012). Package 'rfigshare'. Available:
http://cran.r-project.org/web/packages/rfigshare/rfigshare.pdf Accessed 18 April 2013.

[20] Google Drive. Available: https://drive.google.com/ Accessed 18 April 2013.

[21] Research On Research and Innovation (ROR). Available:
https://sites.google.com/site/researchonresearchtech/home Accessed 18 April 2013.

[22] RStudio Inc (2013). Available: www.rstudio.com Accessed 18 April 2013.



[23] Wickham H (2013). Reproducibility. Available:
https://github.com/hadley/devtools/wiki/Reproducibility Accessed 18 April 2013.

[24] Github. Available: https://github.com/ Accessed 18 April 2013.

[25] Xie Y (2013). knitr: A general-purpose package for dynamic report generation in R.
Available: https://github.com/hadley/devtools/wiki/Reproducibility Accessed 18 April 2013.

[26] Glocal Registry project in Github. Available: https://github.com/rpietro/GlocalRegistry
Accessed 18 April 2013.

[27] Lang DT, Swayne D, Wickham H, Lawrence M (2012). rggobi: Interface between R and GGobi.
Available: http://cran.r-project.org/web/packages/rggobi/index.html Accessed 18 April 2013.

[28] Adler D, Murdoch D (2013). 3D Visualization package (OpenGL). Available:
http://cran.r-project.org/web/packages/rgl/index.html Accessed 18 April 2013.

[29] RStudio Inc (2013). Shiny: web applications framework for R. Available:
http://cran.r-project.org/web/packages/shiny/index.html Accessed 18 April 2013.



[30] Lewin-Koh N (2013). CRAN Task View: Graphic Displays and Dynamic Graphics and Graphic
Devices and Visualization. Available: http://cran.r-project.org/web/views/Graphics.html Accessed
18 April 2013.

[31] Creative Commons. Available: http://creativecommons.org/ Accessed 18 April 2013.

[32] Google Inc. Google Sites. Available: https://sites.google.com/?pli=1 Accessed 18 April
2013.

[33] Observer Agreement. Available: https://sites.google.com/site/observeragreement/home
Accessed 18 April 2013.

[34] Laine C, Goodman SN, Griswold ME, Sox HC (2007). Reproducible research: moving toward
research the public can really trust. Ann Intern Med. Mar 20;146(6):450-3.

[35] Peng RD (2009). Reproducible research and Biostatistics. Biostatistics. Volume 10, pp
405-408.

[36] Peng RD (2011). Reproducible Research in Computational Science. Science 2 December
334(6060):1226-1227. DOI: 10.1126/science.1213847.



